Abstract. Type analyses of logic programs which aim at inferring the
types of the program being analyzed are presented in a unified abstract
interpretation-based framework. This covers most classical abstract
interpretation-based type analyzers for logic programs, built on
either top-down or bottom-up interpretation of the program. In this
setting, we discuss the widening operator, arguably a crucial one. We present
a new widening which is more precise than those previously proposed.
Practical results with our analysis domain are also presented, showing
that it also allows for efficient analysis.

1 Introduction 

In type analyses, the widening operation has much influence on the results. If the
widening is too aggressive in making approximations then the analysis results
may be too imprecise. On the other hand, if it is not sufficiently aggressive then
the analysis may become too inefficient.

Widening operators are aimed at identifying the recursive structure of the 
types being inferred. All widenings already proposed in the literature are based 
on locating type nodes with the same functors, which are possible sources of 
recursion. However, they disregard whether such nodes in fact come from a
recursive structure in the program or not. This may cause an unnecessary
loss of precision, since the widening result may then impose a recursive structure
on the resulting type in argument positions where the concrete program is in fact
not recursive. We propose a widening operator that tries to remedy this problem.

We present our widening operator for regular type inference in an analysis
framework based on abstract interpretation of the program. In order for the
paper to be self-contained, we first revisit regular types (Section 2) and, in
particular, deterministic ones. We focus on deterministic types for ease of presentation;
however, there is nothing in our widening which prevents it from being applicable also
to non-deterministic types. The abstract interpretation framework is set up in
Section 3. Section 4 reviews previous widenings in the literature, and Section 5
presents ours. In Section 6 experimental results are presented, and Section 7
concludes and discusses future work.

^ In Alexandre Tessier (Ed), proceedings of the 12th International Workshop on Logic 
Programming Environments (WLPE 2002), July 2002, Copenhagen, Denmark. 
Proceedings of WLPE 2002: http://xxx.lanl.gov/html/cs/0207052 (CoRR) 

2 Regular Types

A regular type is a type representing a class of terms that can be described
by a regular term grammar. A regular term grammar, or grammar for short,
describes a set of finite terms constructed from a finite alphabet $\mathcal{F}$ of ranked
function symbols or functors. A grammar $G = (S, \mathcal{T}, \mathcal{F}, \mathcal{R})$ consists of a set of
non-terminal symbols $\mathcal{T}$, one distinguished symbol $S \in \mathcal{T}$, and a finite set $\mathcal{R}$ of
productions $T \to rhs$, where $T \in \mathcal{T}$ is a non-terminal and the right hand side
$rhs$ is either a non-terminal or a term $f(T_1, \ldots, T_n)$ constructed from an n-ary
function symbol $f \in \mathcal{F}$ and n non-terminals.

The non-terminals $T$ are types describing (ground) terms built from the
functors in $\mathcal{F}$. The concretization $\gamma(T)$ of a non-terminal $T$ is the set of terms
derivable from its productions, that is,

$$\gamma(T) = \bigcup_{(T \to rhs) \in \mathcal{R}} \gamma(rhs)$$

$$\gamma(f(T_1, \ldots, T_n)) = \{ f(t_1, \ldots, t_n) \mid t_i \in \gamma(T_i) \}$$
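To make the concretization concrete, the following Python sketch enumerates the terms derivable from a non-terminal up to a given height. The dictionary encoding of grammars, the names `LIST_OF_A` and `concretize`, and the height bound are assumptions made for this illustration only; they do not come from the paper, and the sketch handles only grammars whose right hand sides are functor terms.

```python
from itertools import product

# Assumed encoding (not from the paper): a grammar maps each non-terminal to
# its productions, each a pair (functor, argument non-terminals).
# Ground terms are nested tuples (functor, arg_1, ..., arg_n).
LIST_OF_A = {
    "T": [("[]", []), (".", ["A", "T"])],   # T -> [] | .(A, T)
    "A": [("a", [])],                        # A -> a
}


def concretize(grammar, nonterminal, height):
    """Terms of gamma(nonterminal) whose height is at most `height`."""
    if height == 0:
        return set()
    terms = set()
    for functor, args in grammar[nonterminal]:
        # gamma(f(T_1..T_n)) = { f(t_1..t_n) | t_i in gamma(T_i) }
        arg_terms = [concretize(grammar, a, height - 1) for a in args]
        for combo in product(*arg_terms):
            terms.add((functor,) + combo)
    return terms
```

Calling `concretize(LIST_OF_A, "T", 3)` yields the lists of `a` with at most two elements; the full concretization is the union over all heights, which is infinite for recursive types.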

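The types of interest are each defined by one grammar: each $T_i$ is defined by
a grammar $(T_i, \mathcal{T}_i, \mathcal{F}, \mathcal{R}_i)$, so that for any two types of interest $T_1$ and $T_2$ on $\mathcal{F}$,
$\mathcal{T}_1 \cap \mathcal{T}_2 = \emptyset$. Sometimes, we will be interested in types defined by non-terminals
of a grammar $(T, \mathcal{T}, \mathcal{F}, \mathcal{R})$ other than the distinguished non-terminal $T$. This is
formalized by defining a type $T_i \in \mathcal{T}$ as the grammar

$$(T_i, \; \{T \in \mathcal{T} \mid T_i \to^{*}_{\mathcal{R}} T\}, \; \mathcal{F}, \; \{(T \to rhs) \in \mathcal{R} \mid T_i \to^{*}_{\mathcal{R}} T\}) \qquad (1)$$

where all the non-terminals are renamed apart, $\to^{*}_{\mathcal{R}}$ is the reflexive and
transitive closure of $\to_{\mathcal{R}}$, and $T_i \to_{\mathcal{R}} T_j$ iff $(T_i \to T_j) \in \mathcal{R}$ or $(T_i \to f(\ldots, T_j, \ldots)) \in \mathcal{R}$.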

A grammar is in normal form if none of the right hand sides are non- 
terminals. A particular class of grammars are deterministic ones. A grammar 
is deterministic if it is in normal form and for each non-terminal T the function 
symbols are all distinct in the right hand sides of the productions for T. 

Deterministic grammars are less expressive than non-deterministic ones. De- 
terministic grammars can only express sets of terms which are tuple-distributive; 
informally speaking, which are "closed under exchange of arguments". I.e., if 
the set contains two terms of the same functor, then it also contains terms with 
the same principal functor obtained by exchanging subterms of the previous 
two terms in the same argument positions. Basically, no dependencies between 
arguments of a term can be expressed with deterministic grammars. 

Example 1. Consider the type $T$ denoting the set $\{f(a,b), f(c,d)\}$, which is
non-deterministic,

    T → f(A, B)    A → a    C → c
    T → f(C, D)    B → b    D → d

A deterministic type $T'$ with a concretization which included $\gamma(T)$ would also
have to include $\{f(c,b), f(a,d)\}$, that is,

    T' → f(AC, BD)    AC → a    BD → b
                      AC → c    BD → d

To facilitate the presentation, non-terminals with a single production will often
be "inlined" and multiple right hand sides combined, so that $T$ above will be
written $T \to f(a,b) \mid f(c,d)$ and $T'$ as

    T' → f(AC, BD)    AC → a | c    BD → b | d
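The effect of tuple-distributivity in Example 1 can be reproduced with a small Python sketch that computes the smallest tuple-distributive superset of a finite set of ground terms. The function name `tuple_distributive_closure` and the nested-tuple encoding of terms are assumptions for this illustration, not notation from the paper.

```python
from itertools import product


def tuple_distributive_closure(terms):
    """Smallest tuple-distributive superset of a finite set of ground terms.

    Terms are nested tuples (functor, arg_1, ..., arg_n); this encoding is
    an assumption of the sketch.
    """
    # Group the terms by principal functor and arity.
    groups = {}
    for term in terms:
        groups.setdefault((term[0], len(term) - 1), []).append(term)
    closed = set()
    for (functor, arity), members in groups.items():
        # Close each argument position independently, then recombine freely:
        # no dependency between argument positions survives.
        columns = [
            tuple_distributive_closure({m[i + 1] for m in members})
            for i in range(arity)
        ]
        for combo in product(*columns):
            closed.add((functor,) + combo)
    return closed
```

Applied to the set of Example 1, the closure adds exactly the two terms obtained by exchanging arguments, which is why no deterministic grammar can describe the original two-element set precisely.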

To be able to describe terms containing numbers and variables we introduce
two distinguished symbols $\mathit{num}$ and $\mathit{any}$, plus an additional $\bot$. The
concretization of $\mathit{num}$ is the set of all numbers, the concretization of $\mathit{any}$ is the set of
all terms (including variables), and the concretization of $\bot$ is the empty set of
terms. These symbols are non-terminals but they are considered terminals to the
effect of regarding a grammar as deterministic.

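Let $\mathcal{G}$ be the set of all grammars. If $T_1, T_2$ belong to $\mathcal{G}$, the relation $T_1 \equiv T_2 \Leftrightarrow
\gamma(T_1) = \gamma(T_2)$ is an equivalence relation. The quotient set $\mathcal{G}/\!\equiv$ is a complete
lattice with top element $\mathit{any}$ and bottom element $\bot$ based on the relation of
containment, or type inclusion: for every $T_1, T_2 \in \mathcal{G}/\!\equiv$, $T_1 \sqsubseteq T_2 \Leftrightarrow \gamma(T_1) \subseteq
\gamma(T_2)$. We will denote $\mathcal{G}/\!\equiv$ simply by $\mathcal{G}$.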
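For deterministic grammars, type inclusion can be checked by comparing productions functor by functor. The Python sketch below is one possible way to do this, under stated assumptions: grammars are dictionaries from non-terminals to lists of `(functor, argument non-terminals)` pairs, `num` and `any` are symbols without productions, and every non-terminal has a non-empty concretization (otherwise the check is still sound but may answer `False` too often). The name `included` is hypothetical.

```python
def included(g1, t1, g2, t2, assumed=None):
    """Check t1 (in grammar g1) ⊑ t2 (in grammar g2) for deterministic types."""
    assumed = set() if assumed is None else assumed
    if t2 == "any":                      # any is the top element
        return True
    if (t1, t2) in assumed:              # coinductive hypothesis on cycles
        return True
    if t1 not in g1 or t2 not in g2:     # base symbols such as num
        return t1 == t2
    assumed.add((t1, t2))
    for functor, args in g1[t1]:
        # In a deterministic grammar at most one production matches.
        match = next(
            (bs for f, bs in g2[t2] if f == functor and len(bs) == len(args)),
            None,
        )
        if match is None:
            return False
        if not all(included(g1, a, g2, b, assumed) for a, b in zip(args, match)):
            return False
    return True
```

For instance, lists of numbers are included in lists of arbitrary terms, but not the other way round, because `any` is not included in `num`.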

The least upper bound is given by type union, $(T_1 \sqcup T_2)$, and the greatest
lower bound by type intersection, $(T_1 \sqcap T_2)$. It can be shown that intersection
describes term unification:

$$t_1^{*} \subseteq \gamma(T_1) \;\wedge\; t_2^{*} \subseteq \gamma(T_2) \;\wedge\; t_1\theta = t_2\theta \;\Rightarrow\; (t_1\theta)^{*} \subseteq \gamma(T_1 \sqcap T_2)$$

where $t^{*}$ denotes the set of ground terms which are instances of the term $t$.

3 Abstract Domain for Type Inference

In an abstract interpretation-based type analysis, a type is used as an abstract
description of a set of terms. Given variables of interest $\{x_1, \ldots, x_n\}$, any
substitution $\theta = \{x_1 \leftarrow t_1, \ldots, x_n \leftarrow t_n\}$ can be approximated by an abstract
substitution $\{x_1 \leftarrow T_{x_1}, \ldots, x_n \leftarrow T_{x_n}\}$ where $t_i \in \gamma(T_{x_i})$ and each type $T_{x_i} \in \mathcal{G}/\!\equiv$.
We will write abstract substitutions as tuples $\langle T_1, \ldots, T_n \rangle$, and sometimes also
abbreviate a tuple simply as $T^n$.

Concretization is lifted up to abstract substitutions straightforwardly,

$$\gamma(\langle T_1, \ldots, T_n \rangle) = \{ \{x_1 \leftarrow t_1, \ldots, x_n \leftarrow t_n\} \mid t_i \in \gamma(T_i) \}$$

as well as the equivalence relation $\equiv$. Additionally, we consider a distinguished
abstract substitution $\bot$ as a representative of any $\langle T_1, \ldots, T_n \rangle$ such that there
is $T_i = \bot$. Of course, $\gamma(\bot) = \emptyset$.

An ordering on the domain is obtained as the natural element-wise extension
of the ordering on types:

$$\bot \sqsubseteq T^n$$
$$\langle T_1, \ldots, T_n \rangle \not\sqsubseteq \bot$$
$$\langle T_1, \ldots, T_n \rangle \sqsubseteq \langle T'_1, \ldots, T'_n \rangle \;\Leftrightarrow\; \forall\, 1 \le i \le n \;.\; T_i \sqsubseteq T'_i$$

The domain is a lattice with bottom element $\bot$ and top element $\langle T_1, \ldots, T_n \rangle$
such that $T_1 = \ldots = T_n = \mathit{any}$. The greatest lower bound and least upper
bound domain operations are also lifted element-wise, as follows,

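$$\bot \sqcup T^n = T^n \sqcup \bot = T^n$$
$$\langle T_1, \ldots, T_n \rangle \sqcup \langle T'_1, \ldots, T'_n \rangle = \langle T_1 \sqcup T'_1, \ldots, T_n \sqcup T'_n \rangle$$
$$\bot \sqcap T^n = T^n \sqcap \bot = \bot$$
$$\langle T_1, \ldots, T_n \rangle \sqcap \langle T'_1, \ldots, T'_n \rangle = \langle T_1 \sqcap T'_1, \ldots, T_n \sqcap T'_n \rangle$$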

Using the adjoint $\alpha$ of $\gamma$ as abstraction function, it can be shown that
$(2^{\Theta}, \alpha, \Omega, \gamma)$ is a Galois insertion, where $\Theta$ is the domain of concrete and $\Omega$
that of abstract substitutions. The following abstract unification operator can
be shown to approximate the concrete one. Let $x = t$ be a concrete unification
equation, with $x$ a variable, $t$ any term, and $T^n$ the current abstract substitution,
and let $y_j$, $j = 1, \ldots, m$ be the variables of $t$. The new abstract substitution is:

$$amgu(T^n, x = t) = T^n[T_x / T'_x, \; T_{y_1} / T'_{y_1}, \ldots, T_{y_m} / T'_{y_m}] \qquad (2)$$

with each $T$ replaced by $T'$ in the tuple, $T'_x = T_x \sqcap t\mu$, $\mu = \{y_1 \leftarrow T_{y_1}, \ldots, y_m \leftarrow
T_{y_m}\}$, and $solve(t, T'_x) = \{y_1 = T'_{y_1}, \ldots, y_m = T'_{y_m}\}$, a set of equations that
define the types of the variables of a term $t \in \gamma(T'_x)$, obtained as:

$$solve(t, T) = \begin{cases} \{t = T\} & \text{if } t \text{ is a variable} \\ \displaystyle\bigcup_{T \to f(T_1, \ldots, T_n)} \; \bigcup_{i = 1, \ldots, n} solve(t_i, T_i) & \text{if } t \text{ is } f(t_1, \ldots, t_n) \end{cases}$$
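The definition of $solve$ translates almost directly into code. The following Python sketch follows the formula literally, under assumed encodings that are not part of the paper: variables are Python strings, compound terms and constants are tuples `(functor, arg_1, ..., arg_n)`, and a grammar is a dictionary from non-terminals to lists of `(functor, argument non-terminals)` pairs. Like the formula, it presupposes that the term belongs to the type and does not report failure.

```python
def solve(term, nonterminal, grammar):
    """Equations (variable, type) for the variables of `term` against a type.

    Assumed encoding: variables are strings, other terms are tuples
    (functor, arg_1, ..., arg_n).
    """
    if isinstance(term, str):                 # t is a variable: {t = T}
        return {(term, nonterminal)}
    functor, args = term[0], term[1:]
    equations = set()
    # Union over the productions T -> f(T_1, ..., T_n) with the same functor.
    for f, arg_types in grammar[nonterminal]:
        if f == functor and len(arg_types) == len(args):
            for sub_term, sub_type in zip(args, arg_types):
                equations |= solve(sub_term, sub_type, grammar)
    return equations
```

For the list type `L -> [] | .(N, L)` and the term `[X|Xs]`, the sketch returns the equations `X = N` and `Xs = L`.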

In this abstract interpretation-based setting, analysis with a monotonic
semantic function can be easily shown correct. However, it is not guaranteed to
terminate, since $\Omega$ has infinite ascending chains. To guarantee termination, a
widening operator is required.

Example 2. The following program defines the type lists of lists of numbers:

    list_of_lists([]).
    list_of_lists([L|Ls]) :-
        num_list(L),
        list_of_lists(Ls).

    num_list([]).
    num_list([N|Xs]) :-
        number(N),
        num_list(Xs).

For the argument of num_list, without a widening operator, an analysis would
obtain the following first three approximations:

    $T_0 \to []$    $T_1 \to [] \mid .(\mathit{num}, T_0)$    $T_2 \to [] \mid .(\mathit{num}, T_1)$

where each $T_i$ represents lists of at most i numbers. Analysis will never terminate, since
it would keep on obtaining a new type representing lists with one more number.
A widening operator would be required that over-approximates some type $T_i$ to
something like $T_l \to [] \mid .(\mathit{num}, T_l)$, which is the expected type, and allows
termination of the analysis.
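The non-termination argument of Example 2 can be observed with a few lines of Python. The sketch below builds the first approximations for the argument of `num_list` and measures their nesting depth; the frozenset encoding of a type as a set of alternatives and the names `step`, `depth`, and `ascending_chain` are assumptions for this illustration only.

```python
def step(previous):
    """One iteration for num_list/1: the new type is [] | .(num, previous)."""
    return frozenset({("[]",), (".", "num", previous)})


def depth(approx):
    """Nesting depth of an approximation built by `step`."""
    deepest = 0
    for alternative in approx:
        for arg in alternative[1:]:
            if isinstance(arg, frozenset):   # a nested approximation
                deepest = max(deepest, depth(arg))
    return 1 + deepest


def ascending_chain(length):
    """The first `length` approximations T_0, T_1, ... without widening."""
    chain = [frozenset({("[]",)})]           # T_0 -> []
    while len(chain) < length:
        chain.append(step(chain[-1]))
    return chain
```

Every element of the chain is strictly deeper than its predecessor, so no two consecutive approximations coincide and a plain fixpoint iteration never stops.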



4 Widenings 

Functor Widening This is probably the simplest widening operator which still
keeps information from the recursive structure of the program that "produces"
the corresponding terms. The idea behind it is to create a type and a
production for each functor symbol in the original type. All arguments of the function
symbols are replaced with the new type.

Example 3. Consider predicate list_of_lists of Example 2. Its argument
should ideally have the following type: $T_{ll} \to [] \mid .(T_l, T_{ll})$, $T_l \to [] \mid .(\mathit{num}, T_l)$,
but the functor widening will yield: $T \to [] \mid \mathit{num} \mid .(T, T)$.
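A minimal Python sketch of the functor widening, reproducing Example 3, could look as follows. The dictionary encoding of grammars, the treatment of symbols without productions (such as `num`) as constants of the widened type, and the names `functor_widening` and `new_nt` are assumptions of this sketch rather than details taken from the paper.

```python
def functor_widening(grammar, root, new_nt="T"):
    """Collapse a type into a single non-terminal with one rule per functor."""
    seen, pending, functors = set(), [root], set()
    while pending:
        nt = pending.pop()
        if nt in seen:
            continue
        seen.add(nt)
        if nt not in grammar:          # base symbol such as num
            functors.add((nt, 0))
            continue
        for functor, args in grammar[nt]:
            functors.add((functor, len(args)))
            pending.extend(args)
    # Every argument of every functor is replaced by the new type.
    return {new_nt: [(f, [new_nt] * n) for f, n in sorted(functors)]}
```

With the ideal type of `list_of_lists` as input, the result has the single non-terminal `T` with alternatives `.(T, T)`, `[]`, and `num`, as in Example 3.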



Type Jungle Widening A type jungle is a grammar where each functor always
has the same arguments. It was originally proposed as a finite type domain,
since in a domain where all grammars are of the type jungle class all ascending
chains are finite. However, it can be used as a subdomain to provide a widening.

Example 4. Applying this widening to the previous type $T_{ll}$, the following will
be obtained:

    $T \to [] \mid .(T_1, T)$    $T_1 \to [] \mid \mathit{num} \mid .(T_1, T)$

Note that this widening is strictly more precise than the functor widening. In
the example, the new type captures the upper level of lists, but it loses precision
when describing the type of the list elements. This is due to the restriction of
forcing functors to always have the same arguments.

Shortening A grammar can be seen as a graph where the nodes correspond to the 
non-terminals (or-nodes) and to the right hand sides of productions (and-nodes), 
and the edges correspond to the production relation or the relation between a 
functor and its arguments in a right hand side of a production. Given an or-node, 
its principal functors are the functors appearing in its children nodes. 

Example 5. The type $T_{ll}$ of the previous examples can be seen as the graph:

    [Figure: type graph of $T_{ll}$]

Gallagher and de Waal defined a widening which avoids having two or-nodes,
which have the same principal functors, connected by a path. If two such
nodes exist, they are replaced by their least upper bound.

Example 6. In the above example graph, nodes $T_{ll}$ and $T_l$ have the same principal
functors ([] and .) so that they are replaced, yielding:

    $T \to [] \mid .(T_1, T)$    $T_1 \to [] \mid \mathit{num} \mid .(\mathit{num}, T)$

Note the precision improvement with respect to the result in the previous
example. Note also that the result is still imprecise.

Restricted Shortening Saglam and Gallagher propose a more precise variant
of the previous widening. Shortening is restricted so that two or-nodes $T$ and $T'$
which are connected by a path from $T$ to $T'$ and have the same principal functors
are replaced only if $T' \sqsubseteq T$. If this is the case, only $T'$ needs to be replaced, since
the least upper bound is $T$.

Example 7. Continuing previous examples, since nodes $T_{ll}$ and $T_l$ have the same
principal functors but $T_l \not\sqsubseteq T_{ll}$, the widening operation will make no change. In
this case, the most precise type is achieved.

Note, however, that restricted shortening does not guarantee termination in
general (and thus, it is not, strictly speaking, a widening). There are cases in
which analysis may not terminate using only this widening operator.

Depth Widening Janssens and Bruynooghe proposed a type analysis in which
the widening effect is achieved by a "pruning" of the type depth up to a certain
bound. A parameter k establishes the maximum number of occurrences of a
functor in-depth in a type. The idea is similar to the well-known depth-k abstraction
for term structure analysis. The resulting type analysis uses normal restricted
type graphs, which are basically deterministic types satisfying the depth limit.
Obviously, the precision depends on the value of the parameter k.

Example 8. The widening of our previous type $T_{ll}$ with k=1 will yield the same
result as the functor widening (Example 3), whereas with k=2 it will yield the
same result as restricted shortening (Example 7).

Topological Clash Widening Van Hentenryck et al. proposed the first widening
operator that takes into account two consecutive approximations to the type
being inferred. After merging the two (i.e., calculating their least upper bound),
the result is compared with the previous approximation to try to "guess" where
the type is growing. This is done by locating topological clashes: functors that
differ or appear at different depth in each type graph. The clashes are resolved
by replacing them with the recently calculated least upper bound.



Example 9. Consider the program:

    sorted([]).
    sorted([_X]).
    sorted([X,Y|L]) :- X =< Y, sorted([Y|L]).

and the moment during analysis when the final widening is performed. The
resulting type for the argument of sorted/1 is the one on the left below for the
first two clauses, and the one on the right for the last one:

    $T_0 \to [] \mid .(\mathit{any}, [])$        $T_1 \to .(\mathit{num}, .(\mathit{num}, T_l))$
                                   $T_l \to [] \mid .(\mathit{num}, T_l)$

Their least upper bound is $T_u$ on the left below, which exhibits a clash with $T_0$
in the second argument of functor ./2. Thus, the result of widening is $T_w$:

    $T_u \to [] \mid .(\mathit{any}, T_l)$       $T_w \to [] \mid .(\mathit{any}, T_w)$

All the widening operators above are based on locating recursive structures in the type
definitions where there are nodes with the same functors. This may cause an
unnecessary loss of precision, since the widening may impose a recursive structure
on the resulting type in argument positions where the concrete program is in
fact not recursive. In the following section we present a new widening operator
that tries to remedy this problem.

5 Structural Type Widening 

In this section we define an extended domain for type analysis which incorpo- 
rates a widening operator aimed at improving the precision of the analysis. The 
domain is defined so as to keep track of information on the program structure, so 
that recursion on the types produced by the analysis is imposed by the widening 
operator only in the cases where it corresponds to a recursive structure in the 
program being analyzed. To this end, type names will be used. 

A type name is roughly a (distinguished) non-terminal that represents a type 
produced during the analysis. Type names are created for each variable in each 
argument of each variant of each program atom for each predicate (note how 
this is different from, for example, set-based analyses where variants are not 
taken into account). 

Type names provide information on how types are being formed from other
types during analysis. This makes it possible to precisely identify the places where
recursion should be imposed on the types: in a subterm of the type which happens to
refer to the name of that type. To this end, type names contain references to the
positions of their constituent types. To determine positions, selectors are used, as
defined below.

Definition 1 (selector). Define $t/s$, the subterm of a concrete term $t$
referenced by a selector $s$, inductively as follows. The empty selector $\epsilon$ refers to the
term $t$, that is, $t/\epsilon = t$. If $t/s = t'$ and $t'$ is a compound term $f(t_1, \ldots, t_i, \ldots, t_n)$
(where $f$ is an n-ary function symbol) then $t/s \cdot (f.i) = t_i$, $1 \le i \le n$.

For every two selectors $s$, $p$, if $t/s = t'$ and if $t'/p$ exists then $t/s \cdot p = t'/p$.
The initial $\epsilon$ of a non-empty selector will often be omitted, so $\epsilon \cdot p$ will be written
simply as $p$.
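Selectors behave like paths into a term. The Python sketch below implements $t/s$ under assumed encodings that are not from the paper: a term is a nested tuple `(functor, arg_1, ..., arg_n)` and a selector is a sequence of `(functor, index)` pairs, with the empty sequence playing the role of the empty selector. The name `select` and the use of `None` for an undefined selection are also assumptions.

```python
def select(term, selector):
    """Return t/s, or None when the selector does not apply to the term."""
    for functor, index in selector:
        # Each step (f.i) requires a compound term with principal functor f
        # and at least i arguments.
        if term[0] != functor or not (1 <= index < len(term)):
            return None
        term = term[index]
    return term
```

For the list `[a, b]`, written `.(a, .(b, []))`, the selector `('.'.2)·('.'.1)` picks the second element `b`.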

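We define a set of type names $\mathcal{N}$ such that $\mathcal{N} \cap \mathcal{T} = \emptyset$ and a set $2^{\mathcal{N} \times \mathcal{G}}$ of
relations $X \in 2^{\mathcal{N} \times \mathcal{G}}$ between type names and types, of the form $X \subseteq \mathcal{N} \times \mathcal{G}$.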

Definition 2 (label). Let $X$ be a relation between type names and types. Given a
type name $N$, a label of $N$ is a tuple $(s, N')$, where $s$ is a selector and $N'$ is a
type name, iff $(N, T) \in X$, $(N', T') \in X$, and $T' \sqsubseteq T/s$.

Labels of a type name $N$ indicate subterms of the type $T$ defining $N$ where
other type names occur.

Example 10. Let $X$ be a relation such that $\{(A, T_1), (B, T_2)\} \subseteq X$, and let
$(T_1, \mathcal{T}_1, \mathcal{F}, \mathcal{R}_1)$ and $(T_2, \mathcal{T}_2, \mathcal{F}, \mathcal{R}_2)$ be grammars such that the only rule for $T_1$ is $(T_1 \to
f(b)) \in \mathcal{R}_1$ and $(T_2 \to g(c, T_3)) \in \mathcal{R}_2$, $(T_3 \to b \mid f(b)) \in \mathcal{R}_2$. Consider a label
$((g.2), A)$ of $B$. We have that $T_1 \sqsubseteq T_2/(g.2) = T_3$.

Definition 3 (type descriptor). Let $\mathcal{G}$ be a set of types (regular term grammars),
$\mathcal{N}$ a set of type names, and $X \subseteq \mathcal{N} \times \mathcal{G}$. A type descriptor is a tuple $(N, E, T)$
where $N \in \mathcal{N}$, $T \in \mathcal{G}$, $(N, T) \in X$, and $E$ is a set of labels of $N$.

In the new domain, type descriptors will be used instead of types. Let $\mathcal{D}$
be the set of all type descriptors from given sets of types $\mathcal{G}$ and of type names
$\mathcal{N}$. Concretization is defined as $\gamma((N, E, T)) = \gamma(T)$. The domain ordering and
operations on $\mathcal{D}$ are the same as on $\mathcal{G}$ except for type names. In this case, they
have to take into account the possible labels of the type name.

Inclusion $(N_1, E_1, T_1) \sqsubseteq (N_2, E_2, T_2) \Leftrightarrow T_1 \sqsubseteq T_2 \wedge E_1 \subseteq E_2$.

Union $(N, E, T) = (N_1, E_1, T_1) \sqcup (N_2, E_2, T_2) \Leftrightarrow T = T_1 \sqcup T_2 \wedge E = E_1 \cup E_2$.

Intersection $(N, E, T) = (N_1, E_1, T_1) \sqcap (N_2, E_2, T_2) \Leftrightarrow T = T_1 \sqcap T_2 \wedge E = E_1 \cup E_2$.

Again, we may be interested in types defined by non-terminals other than
the distinguished non-terminal $T$ of a grammar $(T, \mathcal{T}, \mathcal{F}, \mathcal{R})$. A type descriptor
$(N_i, E_i, T_i)$, where $T_i \in \mathcal{T}$, is formally defined from $(N, E, T)$ as follows: $T_i$ is
the grammar of Equation 1, $N_i$ is a new type name, and

$$E_i = \{ (p, N') \mid (s \cdot p, N') \in E \;\wedge\; T/s = T_i \}.$$

Abstract substitutions for variables of interest $\{x_1, \ldots, x_n\}$ are now defined
as tuples of the form $\langle (N_1, E_1, T_{x_1}), \ldots, (N_n, E_n, T_{x_n}) \rangle$. Concretization and the
domain ordering and operations are lifted to abstract substitutions element-wise,
in the same way as in Section 3, including the widening operator defined below.
If now $\Omega$ is the domain of type descriptors, it can be shown that $(2^{\Theta}, \alpha, \Omega, \gamma)$
is a Galois insertion, where $\Theta$ is the domain of concrete substitutions and $\alpha$ is
the adjoint of $\gamma$. Abstract unification is defined
as in Equation 2, but using type descriptors instead of types (and preserving all
type names in the "input" abstract substitution $T^n$ to amgu).

Definition 4 (structural widening). The widening between an approximation
$T_2$ to type name $N$ and a previous approximation $T_1$ to $N$ is $(N, E_1, T_1) \nabla
(N, E_2, T_2) = (N, E_1 \cup E_2, T)$, such that $T$ is defined by $(T, \mathcal{T}, \mathcal{F}, \mathcal{R})$ where
$\mathcal{T} = \{T_i \mid T \to^{*}_{\mathcal{R}} T_i\}$, and $\mathcal{R}$ is obtained by the following algorithm:

    $T' := T_1 \sqcup T_2$, defined by $(T', \mathcal{T}', \mathcal{F}, \mathcal{R}')$
    $S := \{ s \mid (s, N) \in E_1 \cup E_2 \}$
    $Seen := \emptyset$
    for each $(T' \to f(A_1, \ldots, A_n)) \in \mathcal{R}'$ add to $\mathcal{R}$ production
        $T \to f(widen(A_1, \mathcal{R}', (f.1)), \ldots, widen(A_n, \mathcal{R}', (f.n)))$

    widen($N$, $\mathcal{R}'$, $Sel$):
        if $N = \mathit{any}$ return $\mathit{any}$
        if $\exists M \; (N, M) \in Seen$ return $M$
        let $M$ be a new non-terminal
        $Seen := Seen \cup \{(N, M)\}$
        for each $(N \to f(A_1, \ldots, A_n)) \in \mathcal{R}'$ add to $\mathcal{R}$ production
            $M \to f(widen(A_1, \mathcal{R}', Sel \cdot (f.1)), \ldots, widen(A_n, \mathcal{R}', Sel \cdot (f.n)))$
        if $Sel \in S$ then
            add to $\mathcal{R}$ production $M \to T$
        return $M$
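The algorithm of Definition 4 can be sketched in Python as follows. This is an illustration under assumptions, not the authors' Prolog implementation: grammars are dictionaries from non-terminals to lists of productions `(functor, argument non-terminals)`, a production `M -> T` is recorded as the bare string `T`, selectors are tuples of `(functor, index)` pairs, fresh non-terminals are named `M1`, `M2`, ..., and every symbol without productions (not only `any`, but also `num`) is returned unchanged. The least upper bound and the selector set `S` are taken as inputs.

```python
def structural_widening(lub, root, selectors, new_root="T"):
    """Build the productions of the widened type from the lub grammar.

    lub       -- grammar of T_1 ⊔ T_2, rooted at `root`
    selectors -- the set S of selectors labelled with the widened type name
    """
    result = {new_root: []}
    seen = {}                                # the Seen relation: N -> M

    def widen(nt, sel):
        if nt not in lub:                    # any, num: assumed left as is
            return nt
        if nt in seen:
            return seen[nt]
        fresh = "M%d" % (len(seen) + 1)
        seen[nt] = fresh
        result[fresh] = []
        for functor, args in lub[nt]:
            new_args = [widen(a, sel + ((functor, i + 1),))
                        for i, a in enumerate(args)]
            result[fresh].append((functor, new_args))
        if sel in selectors:
            result[fresh].append(new_root)   # production M -> T
        return fresh

    for functor, args in lub[root]:
        new_args = [widen(a, ((functor, i + 1),)) for i, a in enumerate(args)]
        result[new_root].append((functor, new_args))
    return result
```

When the type being widened is labelled at the second argument of `./2`, the sketch introduces the production `M1 -> T` there, which closes the recursion; with an empty selector set the result is just a renamed copy of the least upper bound.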

Structural widening basically identifies subterms of the new type $T_1 \sqcup T_2$
where a reference to the type $N$ being widened appears, and makes this
"self-reference" explicit in the definition of the new type. Note that the widening
operation starts with the least upper bound and, basically, adds new grammar
rules to that type. Therefore, the result always includes that upper bound, and is
thus a correct approximation of it. This justifies its correctness. Moreover, this approach based
on type names is potentially more precise than any of the previous widening
operators discussed, as the following examples show:

Example 11. Consider program sorted in Example 9. A top-down analysis
with topological clash was roughly described there. Let us now look at analysis
using restricted shortening. The resulting type happens to be the same one.

Analysis of program atom sorted([Y|L]) approximates variable Y always as
num, both in the calls and in the successes. The first two success approximations
for variable L are [] and .(num, []). Their lub (and widening) is:

    $T_1 \to [] \mid .(\mathit{num}, [])$

The next approximation to the type of L is $.(\mathit{num}, T_1)$. Its lub with $T_1$ is $T_2 \to
[] \mid .(\mathit{num}, T_1)$, and since $T_2$ and $T_1$ have the same functors, and $T_1$ is included
in $T_2$, the widening of $T_2$ is:

    $T_3 \to [] \mid .(\mathit{num}, T_3)$

i.e., list of numbers. The next approximation to the type of L is $.(\mathit{num}, T_3)$ (i.e.,
a list with at least one number). It is included in $T_3$, so fixpoint is reached.

The success of principal goal sorted(X) is approximated after analyzing the
two non-recursive clauses by $T_4 \to [] \mid .(\mathit{any}, [])$. Analysis of the third clause
yields $.(\mathit{num}, .(\mathit{num}, T_3))$. Its lub with $T_4$ is $T_5 \to [] \mid .(\mathit{any}, T_3)$. The widening
of $T_5$ finds that $T_5$ and $T_3$ have the same functors and $T_3 \sqsubseteq T_5$, since $\mathit{num} \sqsubseteq
\mathit{any}$. Thus, the result of widening is:

    $T_6 \to [] \mid .(\mathit{any}, T_6)$

i.e., list of terms. This is the final result after one more iteration. Note that the
information about successes where the tail of lists of length greater than one is
a list of numbers is lost.

Let us now consider structural widening. Analysis of atom sorted([Y|L])
always approximates the type of Y by $(N_{13}, \emptyset, \mathit{num})$. For variable L the two first
approximations are $(N_{14}, \emptyset, [])$ and $(N_{14}, E_{14}, .(\mathit{num}, []))$, where the set of labels
is $E_{14} = \{ (('.'.1), N_{13}), (('.'.2), N_{14}) \}$. The result of widening is $(N_{14}, E_{14}, T_1)$
where $T_1$ is defined as:

    $T_1 \to [] \mid .(\mathit{num}, T_1)$

i.e., list of numbers. This is the final result after one more iteration.

The success of principal goal sorted(X) is approximated after analyzing the
two non-recursive clauses by $(N_3, \emptyset, T_2)$ where $T_2 \to [] \mid .(\mathit{any}, [])$. Analysis of
the third clause yields $(N_3, E_3, .(\mathit{num}, .(\mathit{num}, T_1)))$, where

    $E_3 = \{ (('.'.2) \cdot ('.'.1), N_{13}), \; (('.'.2) \cdot ('.'.2), N_{14}) \}$

Its widening with the previous approximation $T_2$ is $(N_3, E_3, T_3)$, where

    $T_3 \to [] \mid .(\mathit{any}, T_1)$

which amounts to their lub, since the widening operator does not produce any
change, because $N_3$ is not among its own labels. Therefore, the final result, after
one more iteration, is $T_3$, where indeed lists of length greater than one have a
tail which is a list of numbers.

However, structural widening does not guarantee termination. It is effective 
as long as the new approximation is built from the previous approximation of 
the type being inferred. This case is identified, in essence, by locating a reference 
to the type name of the previous approximation within the definition of the new 
one. However, there are contrived cases in which a type is constructed during 
analysis which loses the reference to the previous approximation. In these cases, 
a more restrictive widening has to be applied to guarantee termination. 

Example 12. Consider the program:

    main :- p(a).

    p(a).
    p(X) :- q(X,Y), p(Y).

    q(a, f(a)).
    q(f(Z), f(L)) :- q(Z,L).

The calling substitutions for atom p(Y) form the sequence

    $T_1 \to f(a)$    $T_2 \to f(f(a))$    $T_3 \to f(f(f(a)))$    ...

whereas the type $T \to f(a) \mid f(T)$ correctly describes such calls. However, the
analysis is not able to infer such a type.

The problem in the above example is that none of the approximations $T_i$
contains a reference to the previous approximation. This originates in the
program fact for predicate q/2, which causes the loss of the reference to the
previous approximation because of the double occurrence of constant a.

In our analysis, termination is guaranteed by a bound on the number of times
the widening operation can be applied to a type name. A counter is associated with
each type name, so that when the bound is reached a more restrictive widening
that guarantees termination is applied.
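The counter-based safeguard can be sketched in a few lines of Python. The function name `bounded_widen`, its parameters, and the way the two widenings are passed in as callables are assumptions for illustration; the paper does not fix the bound nor name the fallback widening.

```python
def bounded_widen(name, old, new, counters, bound, precise, fallback):
    """Apply `precise` at most `bound` times per type name, then `fallback`.

    counters -- dict from type names to the number of widenings applied so far
    precise  -- e.g. structural widening (may not terminate on its own)
    fallback -- a more restrictive widening that guarantees termination
    """
    counters[name] = counters.get(name, 0) + 1
    if counters[name] > bound:
        return fallback(old, new)
    return precise(old, new)
```

Since each type name can trigger the precise operator only a bounded number of times, the termination argument reduces to that of the fallback widening.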

6 Analysis Results 

We have implemented analyses based on most of the widenings discussed in this
paper, including structural widening. The implementation is in Prolog and has
been incorporated into the CiaoPP system, which uses the top-down analysis
algorithm of PLAI. The analysis based on regular approximations, which
uses a bottom-up algorithm, is also incorporated into the system. This analysis
uses shortening. We want to compare the top-down and bottom-up approaches
with the same widening and similar implementation technology,^ as well as the
precision and efficiency, within the same analysis framework, of the widening
operators previously discussed.

We have used two sets of benchmark programs: the one used in the PLAI
framework and that used in the GAIA framework. A summary of the
benchmarking follows. The analysis times in milliseconds are shown in Table 1 (left).
The first column (rul) is for the regular approximation analysis and the other
three for the PLAI-based analyses: column short for shortening, column clash
for topological clash, and column struct for structural widening.

Table 2 shows results in terms of precision. None of the other analyses is
ever more precise than struct. The improved precision of struct has been
measured as follows. The left subcolumns under rul, short, and clash show
the number of types with a more precise definition inferred by struct. The
right subcolumns show the number of types where the previous ones appear
(and are thus, also, more precise). The former are types directly inferred from
program predicates; the latter are types which are defined from the former, due
to the data flow in the program.

The following conclusions can be drawn from the tables. First, the regular
approximation approach seems to behave better in terms of efficiency than the
program interpretation approach, at least for the bigger programs. This
conclusion, however, has to be taken with some care, since the current implementation
of rul performs some caching of the type grammars that the PLAI-based
analysis does not. This should be the subject of a more thorough evaluation, which is out
of the scope of this paper. The fact that it improves in bigger programs seems
to suggest that the effect of this caching is most surely not negligible.

^ Similar in the programming technique. Of course, the regular approximation method 
is rather different from the method of program interpretation on an abstract domain: 
Evaluating this difference is part of the aim of the comparison. 

            (excluding simplification times)     (including simplification times)
Program       rul   short   clash  struct          rul    short    clash  struct
aiakl         568     469     529     900          697     3009     3738    1409
bid          1480    2209    2529    4730         2899    31278    35949   15259
boyer        3450    3890    4989    9629        19620   201169   206917   92117
browse        758     380     389     539          987     2848     2987    1698
cs_o         3840    1889    2689    2580        11958    17389    32959    4878
cs_r        18549   10720   24479   19560        50760   303430   238788   30169
disj_r       4468    1819    6399    2440         6508    18598    26077    6408
gabriel      1549    1430    1870    1760         2098    13388    22379    5208
grammar       330     160     160     190          759     3169     3169    1279
hanoiapp      620     719    1889    1150          840     3988    13738    3378
kalah_r      1520      79      79      89         2069     1187     1188     888
mmatrix       310     190     209     119          757     1769     2078     488
occur         380     219     330     289          530     1647     2628     767
palin         590     840     980     850          997     8520    11878    2180
pg            839    2020    2980    3990         1349    15380    22870    7370
plan         1138     819     960    1009         1587     6167     6559    2288
progeom       979    1840    2530    3640         1358    12800    17598    6679
qsort         310     590     659     680          520     3439     4168    1409
qsortapp      369    1000    2898    1210          569     7789     9669    2900
queens        329     179     190     180          457     1128     1138     429
query         720     360     370     410         1627    22458    22788   11818
serialize     478     810     969     899          937     8429    11957    2217
witt         2929    4890    1399    1169         3438   188419    42699   25709
zebra         560    3490   14958   12830          717    55100   189949   44540

Table 1. Timing results (in milliseconds)



Regarding the analyses based on program interpretation, it can be concluded 
that the better the precision the worse the efficiency: short takes less than 
clash, and this one takes less than struct; this one is more precise than clash, 
which is more precise than short. This conclusion seems evident at first sight, 
but it is not: in analysis, an improvement in precision can very well trigger an 
improvement in efficiency. This can also be seen in the tables in some cases, the 
most significant probably being zebra. Overall, one can arguably conclude that 
the efficiency loss found is not a high price in exchange for the gain in precision. 

We have also carried out another test. For practical purposes, the CiaoPP
system includes a back-end to the analysis that simplifies the types inferred,
in the sense that equivalent types are identified, so that they are then reduced
to a single type. This facilitates the interpretation of the output. It is the case
that the structural widening includes a certain amount of type simplification, so
that the analysis creates fewer distinct types which are in fact equivalent. For
this reason, we have repeated the same tests as above, now adding the times
taken in the back-end simplification phase.

The times including the simplification are shown in Table 1 (right). The columns
read as before. It can be seen that in this case structural widening outperforms
all of the other analyses, except, in some cases, rul. It can also be observed that
rul usually behaves better than short also when simplification is included. This
seems to suggest that incorporating our widening into the regular approximation
approach would probably give the best results in practice.^

Program        rul         short        clash
aiakl         1    1      1    1
bid           9   12      9   12
cs_o          4   18      4   18      2    9
cs_r          4   28      4   28      2   19
disj_r        6   13      6   13
mmatrix       2    2      2    2
occur         1    1      1    1
palin         2    4      2    4
pg            1    1      1    1
qsort         1    1      1    1
serialize     2    4      2    4
zebra         3    3      3    3      1    1

Table 2. Precision results

7 Conclusions 

We have presented a new widening operator on regular types within an
abstract interpretation-based characterization of type inference. The idea behind
it is similar to set-based analyses in that we assign and fix type names,
but it is applied here with more generality. It can be seen as a generalization
of the idea of "guessing" the growth of the types during analysis which is
behind the topological clash widening. Instead of guessing, our technique determines exactly where the type
is growing. The resulting widening operator has been presented on deterministic
regular types. However, its extension to non-deterministic regular types should
be straightforward.

Our operator is more precise than previous approaches, but it is still efficient.
This has been shown with (preliminary) practical results. However, it does not
guarantee termination. We are currently working on the non-termination
problem. A moded type domain will help in this. The idea is to enhance abstract
unification so that it is able to identify the "transference" of type names from
the input to the output types, so that the names are not dropped. This will
remedy the problem of Example 12 and, hopefully, allow us to prove termination of
analyses with the proposed widening operator.

Finally, this work has revealed two issues that may be worth investigating
for practical purposes: the impact on the efficiency of analysis of the different
implementation techniques for different analysis methods, on one hand, and of
the simplification of types, on the other hand.



^ This, however, may not be trivial. It is subject for future work. 